METHOD OF GENERATING A MULTIFIDELITY MODEL OF A SYSTEM

Field of the Invention 

The present invention relates to a method of generating a
multifidelity model of a system.

Background 

High cost/high fidelity models are used in relation to
many engineering design problems. For example, a typical high
fidelity model might be a finite element (FE) model having a
large number of elements which allow an engineering system's
behaviour to be characterised to a high level of accuracy.
Design optimisation of the system using the FE model may be
desirable, but may also require many rounds of analysis. This
can entail high computational burdens which can make the
optimisation process impractical.

For this reason, more approximate or low fidelity models
for systems may be sought. Such low fidelity models may be
global or local. Global low fidelity models try to capture the
behaviour of an objective function and/or constraints over the
entire domain of interest. Local models are defined in a
specific region of the design space.

A common way of tackling the problem of expensive
function/model optimization is through the use of
approximations to the expensive function/model. Response
surface methods (see for example Myers and Montgomery (1995))
seek polynomial approximations to the function/model. These
approximations, once constructed, define a low fidelity model
which can provide a cheap means of approximating the original
expensive function/model.

An approach based on kriging is described in Jones et al.
(1998). Their algorithm builds a global approximation using a
kriging model and then performs optimisation using this model.
Another possible approximation strategy involves the use of
neural networks to build a global approximation. However, both
these approaches are, in effect, methods of curve fitting.
They build low cost models using data points from the high cost
model and do not attempt to incorporate any further information
on the problem in hand.

One concern with these approaches is the level of accuracy
of the resulting approximation arising from the inevitably
limited quantities of training data used. As a result, there
has been an interest in the use of low fidelity models to
sample parameter space at points that are not sampled during
expensive function/model evaluation. These low fidelity
models, while being less accurate than the original model, are
generally much cheaper to compute. As an example, in an FE
analysis the cheap model may use a coarser mesh than the
original expensive model, while during a computational fluid
dynamics (CFD) analysis a panel code may replace an expensive
Euler analysis. Combining a low fidelity model with a more
accurate but expensive model can result in a useful compromise
between accuracy and computational cost.

Perhaps the simplest way of utilizing low fidelity
information is to consider the differences between the high and
low fidelity models. Thus Watson and Gupta (1996) used a
neural network to model differences between the two models and
applied the approach to microwave circuit design. Their
technique uses a design of experiments (DOE) methodology to
identify configurations of the input variables for which to run
the high fidelity model. The low fidelity model is then run at
these design points, providing information on the difference
between the two models. An approximation to the high fidelity
model can then be constructed using the low fidelity model and
an approximation to the difference.

An alternative to this approach is to model the ratio of
high and low fidelity models. For example, Haftka (1991) and
Chang et al. (1993) calculate the ratio and derivatives at one
point in order to provide a linear approximation to the ratio
at other points in the design space. The approach is applied
to a wing-box model of a high speed civil transport aircraft.
More recently, the approach has been applied using polynomial
models to approximate the ratio. The approach, termed a
"correction response surface" model, has been applied to
aerodynamic drag approximation by Hutchinson et al. (1994) as
well as to structural problems (see, for example, Vitali et al.
(1999)).

Recently Wang and Zhang (1997) developed a knowledge-based
neural network model for microwave design. This approach
included problem specific knowledge in the form of generic
empirical functions inside the neural network. However, this
approach has limited applicability when empirical functions
representing knowledge are unavailable.

Furthermore, a disadvantage associated with neural
network-based approaches is the difficulty of identifying
the optimal neural network architecture and properly training
the network.

Summary of the Invention 

Thus, in general terms, a first aspect of the present
invention provides a method of generating a multifidelity model
of a system in which a kriging model is used to compensate for
discrepancies between high and low fidelity models of the
system.

More specifically, the first aspect of the present
invention provides a method of generating a multifidelity model
of a system, comprising the steps of:

(a) obtaining training data from a high fidelity model of
the system;

(b) providing a low fidelity model of the system;

(c) providing a kriging model to compensate for
discrepancies between the high and low fidelity models;

(d) adjusting the kriging model to maximise the likelihood
of said training data when the low fidelity model, compensated
by the kriging model, is used to model the system; and

(e) generating a multifidelity model of the system based
on the low fidelity model when compensated by the adjusted
kriging model.

Typically, each training datum comprises (i) a plurality
of input parameters for the high fidelity model and (ii)
corresponding one or more output parameters which result from
running the high fidelity model with these input parameters.
Preferably the data points are selected in order to sample
"design space" representatively.

We have found that, by using a kriging model to compensate
for discrepancies between the high and low fidelity models, the
training of the multifidelity model can be greatly simplified
compared to models based on neural networks. Furthermore, this
advantage is obtainable without significantly compromising the
accuracy of the model.

The kriging model may compensate for discrepancies between
the high and low fidelity models by modelling the differences
between the output parameters of the high fidelity model and
the corresponding output parameters of the low fidelity model.
Alternatively, the kriging model may compensate for
discrepancies by modelling the ratios between the output
parameters of the high fidelity model and the corresponding
output parameters of the low fidelity model. Other
compensation schemes known to the skilled person may also be
adopted.

In general terms, a further aspect of the present
invention provides a method of generating a multifidelity model
of a system comprising providing a low fidelity model of the
system which has adjustable weightings for respective input
parameters to the low fidelity model, and adjusting the
weightings to maximise the likelihood of training data obtained
from a high fidelity model.

We have found that by using such an approach, significant
characteristics of the behaviour of high fidelity models can be
captured directly within the low fidelity model. This can lead
to overall improvements in modelling accuracy.

More specifically, the second aspect of the present
invention provides a method of generating a multifidelity model
of a system, comprising the steps of:

(a) obtaining training data from a high fidelity model of
the system;

(b) providing a low fidelity model of the system, the low
fidelity model having adjustable weightings for respective
input parameters to the low fidelity model;

(c) providing a compensation model to compensate for
discrepancies between the high and low fidelity models;

(d) adjusting the compensation model and the weightings to
optimise the correlation of the low fidelity model, when
compensated by the compensation model, with said training data;
and

(e) generating a multifidelity model of the system based
on the adjusted low fidelity model when compensated by the
adjusted compensation model.

For example, the weightings may comprise shifts in the
values of the respective input parameters. Alternatively, or
additionally, the weightings may comprise scalings in the
values of the respective input parameters.

Preferably, the compensation model is a kriging model, in
which case the correlation optimisation in step (d) is
effectively a likelihood maximisation. In this way, advantages
of the methods of both aspects of the invention may be
combined. However, the compensation model may be e.g. a neural
network.

Typically, the system of the methods of either of the
previous aspects comprises a gas turbine or a part of a gas
turbine. The models may be of stress, strain, fluid flow,
thermal, etc. fields.

Further aspects of the invention provide (i) computer
readable program code for implementing the method of either of
the previous aspects, (ii) computer readable media carrying
program code for implementing the method of either of the
previous aspects, and (iii) a computer system operatively
configured to implement the method of either of the previous
aspects.

As used herein, "computer readable media" refers to any
medium or media which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic
storage media such as floppy discs, hard disc storage medium
and magnetic tape; optical storage media such as optical discs
or CD-ROM; electrical storage media such as RAM and ROM; and
hybrids of these categories such as magnetic/optical storage
media.

As used herein, "a computer system" refers to any hardware
means, software means and data storage means used to perform a
computer-implemented method of the present invention. The
minimum hardware means of such a computer system typically
comprises a central processing unit (CPU), input means, output
means and data storage means. The data storage means may be
RAM or means for accessing computer readable media. An example
of such a system is a microcomputer workstation available from
e.g. Silicon Graphics Incorporated and Sun Microsystems running
Unix based, Windows NT or IBM OS/2 operating systems.

For example, a computer system for implementing the method
of the first aspect of the invention may comprise:

a data storage device or devices for storing (a) training
data obtained from a high fidelity model of the system, (b) a
low fidelity model of the system, and (c) a kriging model to
compensate for discrepancies between the high and low fidelity
models, and

a processor for (a) adjusting the kriging model to
maximise the likelihood of said training data when the low
fidelity model, compensated by the kriging model, is used to
model the system, and (b) generating a multifidelity model of
the system based on the low fidelity model when compensated by
the adjusted kriging model.

A computer system for implementing the method of the
second aspect of the invention may comprise:

a data storage device or devices for storing (a) training
data obtained from a high fidelity model of the system, (b) a
low fidelity model of the system, the low fidelity model having
adjustable weightings for respective input parameters to the
low fidelity model, and (c) a compensation model to compensate
for discrepancies between the high and low fidelity models, and

a processor for (a) adjusting the compensation model and
the weightings to optimise the correlation of the low fidelity
model, when compensated by the compensation model, with said
training data, and (b) generating a multifidelity model of the
system based on the adjusted low fidelity model when
compensated by the adjusted compensation model.

Further aspects of the invention provide (i) computer
readable program code for implementing a multifidelity model
generated using the method of any one of the previous aspects,
(ii) computer readable media carrying program code for
implementing a multifidelity model generated using the method
of any one of the previous aspects, and (iii) a computer system
operatively configured to implement a multifidelity model
generated using the method of any one of the previous aspects.

Brief Description of the Drawings 

Examples of the present invention will now be described in
more detail with reference to the accompanying drawings, in
which:

Fig. 1 shows the overall architecture of a multifidelity
model based on a neural network and a low fidelity model,

Fig. 2 shows in more detail the multifidelity model of
Fig. 1,

Fig. 3 shows the overall architecture of a multifidelity
model based on a kriging model and a low fidelity model,

Fig. 4 shows schematically an elastic beam structure used
in Example 1,

Fig. 5 shows schematically two two-dimensional problems
(Problems A and B) based on the structure of Fig. 4,

Fig. 6 shows schematically a four-dimensional problem
based on the structure of Fig. 4,

Figs. 7a-c show respectively objectives and constraint
boundaries for Cheap, Expensive and KBNN models used to solve
problem A of Fig. 5,

Figs. 8a-c show respectively objectives and constraint
boundaries for Cheap, Expensive and KBK models used to solve
problem B of Fig. 5, and

Figs. 9a and b show respectively the finite elements used
to generate low and high fidelity models of a gas turbine
engine component.

Detailed Description 

The methods proposed by the present invention are based on
techniques which might generally be referred to as response
surface modelling using multifidelity optimisation.

It is useful to consider first, therefore, a more
conventional approach to multifidelity optimisation, before
considering the application of this technique to the
problems that the present invention is aimed at addressing.

Multifidelity Modelling Using Artificial Neural Networks

An artificial neural network (see, for example, White et
al. (1992)) consists of a set of simple processing units which
communicate by sending signals to each other over a large
number of weighted connections. The network is trained using
training data obtained from selective calls to the high
fidelity model. The trained model can then be used as a
surrogate to the original expensive code. However, when
training data is limited due to the prohibitive cost of
generating sufficient learning samples, then such
approximations can be inadequate. The use of multifidelity
models can help to overcome such problems.
Watson and Gupta (1996) successfully applied neural
networks to multifidelity modelling. The basic idea is still
to approximate a function $f_e$ which is expensive to compute so
that very few training data are available. However, the
approximation is improved by using a cheap function $f_a$ which
approximates $f_e$ and is less costly to compute but lacks
accuracy. This cheaper function contains useful information
about the behaviour of $f_e$ in regions where $f_e$ is not sampled.
The difference between the two models

$$d = f_e - f_a \qquad (1)$$

is considered. This is sampled at various locations $x_i$,
$i = 1, 2, \ldots, N$ and provides training data $(x_i, d_i)$,
$i = 1, 2, \ldots, N$ which are used to train the neural network.
Thus the $i$th training datum comprises (i) a vector $x_i$ whose
components are a plurality of input parameters and (ii) the
difference between the expensive and cheaper functions when
these functions receive the input parameters. After training,
the network provides a cheaper approximation $\hat{d}$ to $d$
throughout the whole domain. As a result,

$$f_a + \hat{d} \approx f_e \qquad (2)$$

can be used as a surrogate repetitively at little cost. This
is clearly useful when we wish to optimise the expensive model,
as we can optimise $f_a + \hat{d}$ instead of $f_e$.

Another approach is to model the ratio $r = f_e / f_a$, and then
consider $r f_a$ as a surrogate for $f_e$.

Whichever approach is used, the trained network
effectively acts as a compensation model which compensates for
discrepancies (i.e. $d$ or $r$) between the expensive function
(high fidelity model) and cheap function (low fidelity model).

Furthermore, although we have described the provision of
data $(x_i, d_i)$ and discussed the training of a network to model
$d$, the skilled person would recognise that this is essentially
equivalent to providing data $(x_i, f_{e,i})$ and training the network
so that $f_a$ as compensated by the network approximates $f_e$.
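By way of illustration only, the difference and ratio corrections of equations (1) and (2) can be sketched as follows. The functions `f_e` and `f_a` are hypothetical one-dimensional stand-ins, and a piecewise-linear interpolant is used in place of the trained network purely to keep the sketch short; neither is taken from the method described here.

```python
from bisect import bisect_right

def f_e(x):
    """Hypothetical expensive (high fidelity) model."""
    return x * x + 0.1 * x + 1.0

def f_a(x):
    """Hypothetical cheap (low fidelity) model: same trend, missing one term."""
    return x * x + 1.0

def interpolate(xs, ys, x):
    """Piecewise-linear interpolation through (xs, ys); xs must be sorted."""
    i = min(max(bisect_right(xs, x), 1), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    t = (x - x0) / (x1 - x0)
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

# Sample both models at a few locations x_i (the expensive calls).
samples = [0.0, 0.5, 1.0]
d = [f_e(x) - f_a(x) for x in samples]   # differences, equation (1)
r = [f_e(x) / f_a(x) for x in samples]   # ratios

def surrogate_add(x):
    """f_a + d_hat, the additive surrogate of equation (2)."""
    return f_a(x) + interpolate(samples, d, x)

def surrogate_ratio(x):
    """r_hat * f_a, the ratio-based surrogate."""
    return f_a(x) * interpolate(samples, r, x)
```

Both surrogates reproduce `f_e` at the sampled locations; away from them, the cheap model supplies the trend and the compensation model supplies only the correction.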

Multifidelity Modelling Using a Neural Network and a Low
Fidelity Model

Model Structure

In one of its aspects the present invention proposes a
modified approach in which a low fidelity model which has
adjustable weightings for respective model input parameters is
used to generate a multifidelity model of the system.

This low fidelity model, $f_a$, shares physics with the high
fidelity expensive model, $f_e$, but differs in details. $f_a$
defines our prior knowledge about the system being modelled and
gives us some information as to the behaviour of $f_e$ away from
expensively sampled points. If there is reasonable correlation
between the models, this approach is likely to provide a
relatively accurate prediction of system behaviour,
particularly at extrapolated points.

Fig. 1 shows the overall architecture of the multifidelity
model.

In more detail and with reference to Fig. 2, the
multifidelity model comprises a network with input layer X,
knowledge layer Z, boundary layer B, region layer R, normalised
region layer R' and output layer Y. The low fidelity model,
$f_a$, appears in the knowledge layer Z. The outputs of the
knowledge layer Z and neural layers R' are weighted and merged
by multiplication. In our experience this seems to perform
better than using a multilayer perceptron with a single hidden
layer.

Layers X, B, R, R' and Y effectively form a neural network
that serves as a compensation model to compensate for
discrepancies between the low fidelity model and a high
fidelity model (results from which are used to train the
multifidelity model).

The input layer accepts inputs $x$. Details of the
knowledge layer, boundary layer, region layer, normalised
region layer and output layer follow in equations (3)-(8). We
consider the problem with input vector $x$ ($N_x \times 1$), output $y$
(approximating the high fidelity model $f_e(x)$) and knowledge $z$
(see equation (3)). Both $y$ and $z$ could be vectors, but we will
consider the case of a single output only.

The "empirical knowledge" is provided by the cheap (low
fidelity) model. The input $x$ is weighted so that the knowledge
vector is calculated from the low fidelity model evaluation as

$$z = f_a(W_1 x + w_2), \qquad (3)$$

where $W_1 = \mathrm{diag}\{w_1^1, w_1^2, \ldots, w_1^{N_x}\}$ is a diagonal matrix of weights
for scaling and $w_2$ is a vector of weights for a shift of the
input arguments. This procedure can easily cope with
situations where the cheap and expensive models differ only by
a scaling or a shift in the inputs. The weights in (3) are
adjustable parameters to be determined when training the
network. Since the low fidelity model should be a reasonable
approximation to the high fidelity model, the matrix $W_1$ should
be close to the identity matrix and $w_2$ should be close to the
zero vector.
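As a minimal sketch of equation (3), the knowledge layer can be written as a function that scales and shifts each input before calling the cheap model. The models `f_a` and `f_e` below, and the particular weight values, are hypothetical and chosen only so that the expensive model differs from the cheap one by an input scaling and shift.

```python
def knowledge_layer(f_a, x, w1_diag, w2):
    """Evaluate z = f_a(W1 x + w2), with W1 = diag(w1_diag)."""
    weighted = [w * xi + s for w, xi, s in zip(w1_diag, x, w2)]
    return f_a(weighted)

def f_a(x):
    """Hypothetical cheap model of two inputs."""
    return 1.0 / (x[0] * x[1] ** 2)

def f_e(x):
    """Hypothetical expensive model: the cheap model with scaled, shifted inputs."""
    return f_a([1.05 * x[0] - 0.01, 0.9 * x[1] + 0.02])

x = [0.075, 0.1]
# Identity scaling and zero shift simply reproduce the cheap model.
z_identity = knowledge_layer(f_a, x, [1.0, 1.0], [0.0, 0.0])
# Suitably trained weights can absorb a pure scaling/shift discrepancy.
z_matched = knowledge_layer(f_a, x, [1.05, 0.9], [-0.01, 0.02])
```

In this contrived case the weighted cheap model matches the expensive model exactly, so nothing is left for the compensation model to correct; in practice the weights would only reduce the discrepancy.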

In the boundary layer, the neuron $i$ is calculated as

$$b_i = B(x, v_i), \quad i = 1, \ldots, N_b. \qquad (4)$$

This layer could also incorporate function knowledge as in
Wang and Zhang (1997). However, we take it simply as the inner
product of $x$ and $v_i$

$$b_i = x^T v_i, \quad i = 1, \ldots, N_b, \qquad (5)$$

where $v_i$ are a set of free parameters that will be determined
during the training process.

Using a sigmoid function $\psi$ the region layer neurons are
constructed from boundary neurons as

$$r_i = \prod_{j=1}^{N_b} \psi(\alpha_{ij} b_j + \theta_{ij}), \quad i = 1, 2, \ldots, N_r. \qquad (6)$$

Here $\alpha_{ij}$ and $\theta_{ij}$ are respectively scaling and bias parameters
(i.e. adjustable weightings).

The normalising layer normalises the outputs of the region
layer, that is,

$$r'_i = \frac{r_i}{\sum_{j=1}^{N_r} r_j}, \quad i = 1, \ldots, N_{r'}, \quad N_{r'} = N_r. \qquad (7)$$
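For illustration, a forward pass through the boundary, region and normalised region layers might look like the sketch below. It assumes the inner-product boundary neurons of equation (5), a product-of-sigmoids region layer in the manner of Wang and Zhang (1997), and a logistic sigmoid; the function name and argument layout are hypothetical.

```python
import math

def sigmoid(u):
    """Logistic sigmoid (one common choice; assumed here)."""
    return 1.0 / (1.0 + math.exp(-u))

def normalised_region_layer(x, v, alpha, theta):
    """Return the normalised region outputs r'_i for input vector x.

    v     : N_b weight vectors, one per boundary neuron
    alpha : N_r x N_b scaling parameters
    theta : N_r x N_b bias parameters
    """
    # Boundary layer: b_i is the inner product of x and v_i.
    b = [sum(xj * vij for xj, vij in zip(x, v_i)) for v_i in v]
    # Region layer: product of sigmoids over the boundary neurons (assumed form).
    r = [
        math.prod(sigmoid(a * b_j + t) for a, b_j, t in zip(alpha_i, b, theta_i))
        for alpha_i, theta_i in zip(alpha, theta)
    ]
    # Normalised region layer: divide each r_i by the sum of all r_j.
    total = sum(r)
    return [r_i / total for r_i in r]

# Example with two inputs and N_b = N_r = 3 (arbitrary parameter values).
x = [0.075, 0.05]
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alpha = [[1.0, 2.0, 0.5], [0.3, 1.0, 1.5], [2.0, 0.1, 1.0]]
theta = [[0.0, 0.1, -0.1], [0.2, 0.0, 0.0], [-0.3, 0.4, 0.1]]
r_norm = normalised_region_layer(x, v, alpha, theta)
```

Because every sigmoid output is strictly positive, the normalised outputs are positive and sum to one.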



Finally, the output is given by

$$y = \ldots + \rho_0. \qquad (8)$$



Note that the merging of the knowledge layer and the
neural layer has been performed using multiplication. This is
consistent with the approach of Wang and Zhang (1997). Clearly
other ways of combining this information (e.g. addition) exist
and could be considered. In this way simple relationships
between the high and low fidelity models can be exploited. The
skilled person would also be able to extend the method to
problems with multiple outputs, along the lines of Wang and
Zhang (1997).



Training the Model 



Let $y$ represent the output from the neural network and the
low fidelity model and $f_e$ represent the high fidelity model
output. The neural network and the low fidelity model learn
from the training data $(x_i, f_e(x_i))$, $i = 1, 2, \ldots, M_{data}$. The
trainable parameters are the knowledge weights $W_1$ and $w_2$, the
boundary layer weights $v_i$, $i = 1, 2, \ldots, N_b$, the scaling
parameters $\alpha_{ij}$ and $\theta_{ij}$, $i = 1, 2, \ldots, N_r$, $j = 1, 2, \ldots, N_b$, $\rho_i$, $\rho_0$,
and $\beta_k$, $k = 1, 2, \ldots, N_z$. For the 2D example described below,
this requires a total of 33 parameters to be determined during
training, the majority of these deriving from the neural
network structure.

The undetermined parameters are chosen to minimize the
difference (i.e. optimise the correlation) between the neural
network and the low fidelity model outputs $y$ and the actual
training outputs $f_e$ in the least square sense. Thus we
minimise the sum of squared differences

$$E = \sum_{i=1}^{M_{data}} \left[ y(x_i) - f_e(x_i) \right]^2 \qquad (9)$$

with respect to these parameters.

The derivatives of $E$ with respect to the unknown
parameters are given in Wang and Zhang (1997) and can be used
in gradient descent minimisation. Updating the weights in this
case requires modifying the traditional backpropagation
algorithm (Rumelhart et al. (1986)) slightly to cope with the
different network topology. Of course, other optimization
strategies such as conjugate gradient minimization (Press et
al. (1992)) could be used to determine the weights.

Multifidelity Modelling Using a Kriging Model and a Low
Fidelity Model

Model Structure

In another of its aspects the present invention proposes
an approach in which a kriging model is used to compensate for
discrepancies between high and low fidelity models of the
system and thereby to generate a multifidelity model of the
system.

The method of the previous section included cheap but low
fidelity information along with expensive but high quality
information in a neural network framework. We now turn,
however, to the problem of replacing the neural network with a
kriging model (see Jones et al. (1998) for a detailed
description of the kriging method).

In typical approximation methods, the non-linear
relationship between observations (responses) and independent
variables is expressed as

$$y = f(x) \qquad (10)$$

where $y$ is the observed response, $x$ is a vector of $k$
independent variables

$$x = [x_1, x_2, \ldots, x_k] \qquad (11)$$

and $f(x)$ is some unknown function. We define

$$\hat{y} = \hat{f}(x), \qquad (12)$$

an approximation for $y$ based on kriging. A brief description
of its implementation now follows. We then modify this
classical approach to incorporate knowledge that comes from a
weighted low fidelity model.

Given a set of $N$ training data $[x^{(1)}, x^{(2)}, \ldots, x^{(N)}]$ the kriging model
can be used to make a prediction $\hat{y} = \hat{f}(x)$ at untested points $x$
in the design space.

A correlation matrix of the training data

$$R(x^{(i)}, x^{(j)}) = \exp[-d(x^{(i)}, x^{(j)})] \qquad (13)$$

is first sought, where $d$ is some distance measure. For example

$$d(x^{(i)}, x^{(j)}) = \sum_{h=1}^{k} \theta_h \left| x_h^{(i)} - x_h^{(j)} \right|^{p_h} \quad (\theta_h > 0,\ 1 \le p_h \le 2) \qquad (14)$$

where $\theta_h$ and $p_h$ are some as yet undetermined parameters.

When we wish to sample at a new point $x$, we form a vector
of correlations between the new point and the training data

$$r(x) = R(x, x^{(i)}) = [R(x, x^{(1)}), \ldots, R(x, x^{(N)})]. \qquad (15)$$

The prediction is then given by

$$\hat{y}(x) = \hat{\mu} + r^T R^{-1} (y - 1\hat{\mu}). \qquad (16)$$

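The mean and variance of the prediction are

$$\hat{\mu} = \frac{1^T R^{-1} y}{1^T R^{-1} 1} \qquad (17)$$

and

$$\hat{\sigma}^2 = \frac{(y - 1\hat{\mu})^T R^{-1} (y - 1\hat{\mu})}{N} \qquad (18)$$

respectively, where $1$ denotes a vector of ones.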
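The kriging predictor of equations (13) to (18) can be sketched with NumPy as below. This is an illustrative implementation under stated assumptions rather than the code used for the examples: the function names are hypothetical, $p_h$ is taken equal for all dimensions, and the data are arbitrary.

```python
import numpy as np

def corr_matrix(A, B, theta, p=2.0):
    """Correlations exp(-sum_h theta_h |a_h - b_h|^p) between rows of A and B."""
    diff = np.abs(A[:, None, :] - B[None, :, :]) ** p
    return np.exp(-(diff * theta).sum(axis=2))

def fit_kriging(X, y, theta, p=2.0):
    """Return R, the estimated mean (17) and the estimated variance (18)."""
    n = len(y)
    R = corr_matrix(X, X, theta, p)
    ones = np.ones(n)
    mu = ones @ np.linalg.solve(R, y) / (ones @ np.linalg.solve(R, ones))
    resid = y - mu
    sigma2 = resid @ np.linalg.solve(R, resid) / n
    return R, mu, sigma2

def predict(x_new, X, y, theta, p=2.0):
    """Kriging prediction (16) at a single new point x_new."""
    R, mu, _ = fit_kriging(X, y, theta, p)
    r = corr_matrix(np.atleast_2d(x_new), X, theta, p)[0]
    return mu + r @ np.linalg.solve(R, y - mu)

# Arbitrary two-dimensional training data for demonstration.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
y = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
theta = np.array([1.0, 1.0])
y_hat = predict(np.array([0.25, 0.75]), X, y, theta)
```

Evaluating `predict` at any training point returns the corresponding training value (to rounding error), which is the exact-interpolation property relied upon later in the discussion of training.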

The parameters $\theta_h$ and $p_h$ are determined by maximising the
likelihood

$$\frac{1}{(2\pi)^{N/2} (\sigma^2)^{N/2} |R|^{1/2}} \exp\left[ -\frac{(y - 1\mu)^T R^{-1} (y - 1\mu)}{2\sigma^2} \right] \qquad (19)$$

of the sample.
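In practice it is usually more convenient to work with the logarithm of the likelihood (19). The sketch below evaluates it with the estimates of equations (17) and (18) substituted for the mean and variance, in which case the exponent reduces to $-N/2$. The function name is hypothetical, and a single exponent `p` is assumed for all dimensions.

```python
import numpy as np

def log_likelihood(X, y, theta, p=2.0):
    """Log of the likelihood (19) with the estimates (17), (18) plugged in."""
    n = len(y)
    diff = np.abs(X[:, None, :] - X[None, :, :]) ** p
    R = np.exp(-(diff * theta).sum(axis=2))          # correlation matrix (13)
    ones = np.ones(n)
    mu = ones @ np.linalg.solve(R, y) / (ones @ np.linalg.solve(R, ones))
    resid = y - mu
    sigma2 = resid @ np.linalg.solve(R, resid) / n
    _, logdet = np.linalg.slogdet(R)                 # log |R|, computed stably
    # With sigma2 from (18), the quadratic form in the exponent equals n / 2.
    return -0.5 * (n * np.log(2.0 * np.pi) + n * np.log(sigma2) + logdet + n)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
y = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
ll = log_likelihood(X, y, np.array([1.0, 1.0]))
```

Candidate values of the correlation parameters can then be compared directly by their log-likelihoods, the larger value indicating the preferred model.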
Our multifidelity modelling strategy using kriging again
models the difference or the ratio of the high and low fidelity
models at a given set of sample points. That is, we may
approximate

$$d = f_e - f_a \qquad (20)$$

and add it to $f_a$ to approximate $f_e$. Alternatively we may model

$$r = f_e / f_a \qquad (21)$$

and then take $r f_a$ as a surrogate for $f_e$.

Furthermore, however, we use low fidelity cheap
information along with the high quality expensive information
within the approximating model itself. The cheap model, taken
as prior knowledge, can be suitably weighted (as discussed
above) to ensure best agreement between the two models of
differing fidelity. The general structure of information flow
is shown in Fig. 3. Note the similarity in the approach of the
strategy of Fig. 1 and that of Fig. 3. The only significant
departure is in the way parameters are extracted in the two
cases: while the algorithm underlying the diagram in Fig. 1
uses an artificial neural network technique, that of Fig. 3
uses kriging. Thus in both cases the low fidelity model is an
integral part of the approximation which has in parallel a
compensation model which is either a neural network or a
kriging model. This is in contrast to standard correction
techniques where the low fidelity data do not inherently
control the model training process. In mathematical terms,
in such standard techniques the low fidelity data lack implicit
influence over the likelihood function that needs to be maximized.

Training the Model 

In the present discussion, we consider modelling a response
with a single output. The inputs $x$ are fed into the knowledge
layer (of the weighted low fidelity model) and into the kriging
model. As shown in Fig. 3, the knowledge layer outputs the
value $z$, using the weighted low fidelity method according to
equation (3). The kriging model inputs $x$ and outputs some
prediction, $k$. The output of the model can be defined in
several ways, e.g. based on addition $z + k$ or multiplication
$z \times k$. These could also be weighted as in equation (8). In
the examples that follow, we use multiplication. It may also
be possible to let the model itself decide on the best
functional form between the outputs of the knowledge layer $z$
and the kriging prediction $k$ by using further parameters.

Referring to the neural network-based multifidelity model
of Fig. 1, the undetermined weights were extracted by
minimizing the sum of squares of differences (see equation
(9)). However, this approach is not viable with a kriging
model. This is because kriging models interpolate data
exactly, thus the difference between the data and the model is
zero for all the sampled points, whatever our choice of
weights. Therefore, the free parameters of the model
(including the weights in the low fidelity model) need to be
determined by maximizing the likelihood function of the sample
as given by equation (19). This ensures that the best model out
of all possible interpolating models is chosen. We have set
$p_h = 2$ and optimised with respect to $\theta_h$, $h = 1, \ldots, k$ and the
weights in the knowledge layer. This typically results in a
reduced optimisation problem compared to the training of the
neural network-based multifidelity model. For the 2D example
discussed in the following section, this gives just a
six-dimensional optimisation problem (compared with 33
parameters for the neural network-based multifidelity model).
Thus significant reductions in computational overheads can be
achieved by adopting a kriging approach. Once again, the
optimisation problem can be tackled using standard techniques,
for example, conjugate gradients.
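The training procedure can be sketched as follows for the multiplicative form $z \times k$. The sketch makes several assumptions that are not taken from the description above: the kriging model is fitted to the ratio of the expensive data to the weighted cheap model output, a single input dimension is used for brevity, the models `f_a` and `f_e` are hypothetical, and a simple random search stands in for a conjugate gradient optimiser. The weight bounds are those quoted for the examples.

```python
import numpy as np

def f_a(x):
    """Hypothetical cheap model (one input for brevity)."""
    return 2.0 + np.sin(3.0 * x)

def f_e(x):
    """Hypothetical expensive model: scaled/shifted input plus a mild trend."""
    return f_a(1.1 * x - 0.02) * (1.0 + 0.1 * x)

def kriging_fit(X, ratio, theta):
    """Return R, mean estimate and negative concentrated log-likelihood (p_h = 2)."""
    n = len(X)
    R = np.exp(-theta * (X[:, None] - X[None, :]) ** 2)
    ones = np.ones(n)
    mu = ones @ np.linalg.solve(R, ratio) / (ones @ np.linalg.solve(R, ones))
    resid = ratio - mu
    sigma2 = resid @ np.linalg.solve(R, resid) / n
    _, logdet = np.linalg.slogdet(R)
    return R, mu, 0.5 * (n * np.log(sigma2) + logdet)

def objective(params, X, y_e):
    """Negative log-likelihood as a function of (theta, w1, w2)."""
    theta, w1, w2 = params
    ratio = y_e / f_a(w1 * X + w2)   # data the kriging model must interpolate
    return kriging_fit(X, ratio, theta)[2]

def train(X, y_e, n_trials=500, seed=0):
    """Random search over theta and the knowledge-layer weights (a stand-in)."""
    rng = np.random.default_rng(seed)
    best = np.array([1.0, 1.0, 0.0])          # theta = 1, identity weighting
    best_val = objective(best, X, y_e)
    lo = np.array([1.0, 0.75, -0.025])
    hi = np.array([20.0, 1.10, 0.025])
    for _ in range(n_trials):
        cand = rng.uniform(lo, hi)
        val = objective(cand, X, y_e)
        if val < best_val:
            best, best_val = cand, val
    return best, best_val

def predict(x, X, y_e, params):
    """Multifidelity prediction z * k at a scalar point x."""
    theta, w1, w2 = params
    ratio = y_e / f_a(w1 * X + w2)
    R, mu, _ = kriging_fit(X, ratio, theta)
    r = np.exp(-theta * (x - X) ** 2)
    k = mu + r @ np.linalg.solve(R, ratio - mu)
    return f_a(w1 * x + w2) * k

X = np.linspace(0.0, 1.0, 5)
y_e = f_e(X)
params, nll = train(X, y_e)
```

Whatever weights are selected, the resulting model reproduces the expensive data at the sampled points, which is why a least-squares criterion cannot distinguish between candidate weights and the likelihood is used instead.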

Example 1 



Consider the elastic structure as shown in Fig. 4. In
this example we consider the length $L$ to be 1 metre. The
horizontal beam is subjected to a uniformly distributed load
$p_0 = 50$ N/m. We wish to minimize the weight of the structure
by varying the cross section in various ways. We initially
consider two two-dimensional problems as shown in Fig. 5. In
the first two-dimensional problem (Problem A) the two parts of
the elastic structure have different square sections, while in
the second two-dimensional problem (Problem B) the two parts of
the elastic structure have the same rectangular section. We
also consider a four-dimensional problem as shown in Fig. 6 in
which the two parts of the elastic structure have different
rectangular sections. In all cases the minimisation is carried
out subject to the constraints

$$\sigma_{max} < 100000\ \mathrm{N/m^2} \qquad (22)$$

where $\sigma_{max}$ is the maximum stress in the structure and

$$0.05\ \mathrm{m} \le t_i \le 0.1\ \mathrm{m} \qquad (23)$$

where $i$ respectively varies from one to two or from one to four
for the two and four dimensional problems.

The problem was analysed using a simple FE beam model.
Two levels of complexity were considered: a coarse (low
fidelity) model $f_a$ consisting of just 4 elements and a fine
(high fidelity) model $f_e$ consisting of 100 elements. In these
two models the objective $V$ (volume is proportional to weight)
remains the same whereas the stress, which forms the
constraint, varies. It is this variation in stress between $f_e$
and $f_a$ that we attempt to model.

2D Beam Problem 

Results were obtained from $f_e$ for nine combinations of $t_1$
and $t_2$: (0.05, 0.05), (0.075, 0.05), (0.1, 0.05), (0.05,
0.075), (0.075, 0.075), (0.1, 0.075), (0.05, 0.1), (0.075,
0.1), (0.1, 0.1). This provided a set of training data
containing nine sampled points in design space.
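The nine combinations listed form a full three-level grid over the two thicknesses, which can be generated as in the short sketch below (variable names are illustrative only).

```python
from itertools import product

levels = [0.05, 0.075, 0.1]

# product() varies its last factor fastest, so iterate as (t2, t1) and swap
# to reproduce the listed order, in which t1 varies fastest.
design_points = [(t1, t2) for t2, t1 in product(levels, repeat=2)]
```

Each of these points requires one run of the high fidelity model, so the size of the grid directly sets the cost of obtaining the training data.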

The following seven approaches (referred to hereafter by
the shortened terms in brackets) were used to solve this and
subsequent problems:
(i) Low fidelity model optimisation (Cheap)

(ii) Kriging the expensive data at the sampled points and
optimising (Kriging)

(iii) Kriging the difference $f_e - f_a$ at the sampled points,
adding this to $f_a$ and optimising (Addition)

(iv) Kriging the ratio $f_e / f_a$ at the sampled points,
multiplying this by $f_a$ and optimising (Ratio)

(v) Multifidelity modelling using a neural network and the
weighted low fidelity model (KBNN - Knowledge-Based Neural
Network approach)

(vi) Multifidelity modelling using a kriging model and the
weighted low fidelity model (KBK - Knowledge-Based Kriging
approach)

(vii) Direct optimization of the high fidelity model
(Expensive)

Approaches (iii), (iv) and (vi) are in accordance with the
first aspect of the present invention, approaches (v) and (vi)
are in accordance with the second aspect of the present
invention, and approaches (i), (ii) and (vii) are provided for
comparative purposes. It should be noted, however, that in
many realistic situations direct optimization of a high
fidelity model will not be feasible.
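
The Addition and Ratio approaches, (iii) and (iv), can be sketched as follows. This is an illustration under stated assumptions, not the implementation used to produce the results: `f_a` and `f_e` are hypothetical stand-in functions rather than the FE beam models, and `fit_interpolator` is a simplified Gaussian-kernel interpolator with a constant mean and a fixed, hand-picked length scale, used here in place of a full kriging model.

```python
import numpy as np


def fit_interpolator(X, y, length_scale=0.03, jitter=1e-10):
    """Gaussian-kernel interpolator with constant mean (simplified kriging stand-in)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * length_scale ** 2))

    mean = y.mean()
    K = kernel(X, X) + jitter * np.eye(len(X))  # jitter keeps the solve stable
    weights = np.linalg.solve(K, y - mean)

    def predict(X_new):
        X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
        return mean + kernel(X_new, X) @ weights

    return predict


# Hypothetical stand-ins for the low and high fidelity stress models.
def f_a(T):
    T = np.atleast_2d(T)
    return 100.0 / (T[:, 0] * T[:, 1])


def f_e(T):
    T = np.atleast_2d(T)
    return f_a(T) * (1.0 + 2.0 * T[:, 0]) + 50.0


levels = [0.05, 0.075, 0.1]
X = np.array([(t1, t2) for t2 in levels for t1 in levels])  # nine sampled points

difference = fit_interpolator(X, f_e(X) - f_a(X))  # approach (iii)
ratio = fit_interpolator(X, f_e(X) / f_a(X))       # approach (iv)


def addition_model(T):
    return f_a(T) + difference(T)


def ratio_model(T):
    return f_a(T) * ratio(T)


probe = np.array([[0.06, 0.09]])
print(f_e(probe), addition_model(probe), ratio_model(probe))
```

Both corrected models reproduce the high fidelity values at the nine sampled points by construction; how well they do between the points depends on how smooth the difference or ratio is, which is why their relative performance is problem dependent.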

In both the KBNN and the KBK models we consider the
elements of W_1 in the range [0.75, 1.10] and those of w_2 in the
range [-0.025, 0.025]. In the KBNN neural network we take
N_b = N_c = N_r* = 3. The results for Problem A are shown in
Table I. Table I also lists the relative error (stress) in
each model. This is an average error taken over 441 test
points spread throughout the design space. The error was
computed by taking results of the high fidelity model as exact.
Table II lists the same results as Table I, but for Problem B.
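
The averaged relative error can be sketched as below. The text does not say how the 441 test points were placed; since 441 = 21 x 21, a regular 21-by-21 grid over the design space is assumed here purely for illustration. The `reference` and `model` functions are hypothetical placeholders for the high fidelity model and the model under test.

```python
def average_relative_error(model, reference, points):
    """Mean of |model - reference| / |reference| over the test points."""
    total = 0.0
    for p in points:
        exact = reference(p)
        total += abs(model(p) - exact) / abs(exact)
    return total / len(points)


def regular_grid(lo, hi, n):
    """n-by-n grid of (t1, t2) pairs covering [lo, hi] in each variable."""
    step = (hi - lo) / (n - 1)
    return [(lo + i * step, lo + j * step) for j in range(n) for i in range(n)]


test_points = regular_grid(0.05, 0.1, 21)  # assumed layout of the 441 points


# Hypothetical placeholders: the "model" underestimates stress by 10 %.
def reference(p):
    return 100.0 / (p[0] * p[1])


def model(p):
    return 0.9 * reference(p)


error = average_relative_error(model, reference, test_points)
print(len(test_points), error)
```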
Table I

Model       t_1          t_2               V                Relative error
Cheap       5 x 10^-2    6.7846 x 10^-2    8.139 x 10^-3    1.837 x 10^-1
Kriging     5 x 10^-2    7.3944 x 10^-2    9.003 x 10^-3    9.267 x 10^-2
Addition    5 x 10^-2    7.2780 x 10^-2    8.832 x 10^-3    2.418 x 10^-2
Ratio       5 x 10^-2    7.2645 x 10^-2    8.813 x 10^-3    2.262 x 10^-3
KBNN        5 x 10^-2    7.2576 x 10^-2    8.803 x 10^-3    2.8160 x 10^-4
KBK         5 x 10^-2    7.2576 x 10^-2    8.803 x 10^-3    1.723 x 10^-3
Expensive   5 x 10^-2    7.2571 x 10^-2    8.802 x 10^-3    N/A

Table II

Model       t_1          t_2               V                 Relative error
Cheap       5 x 10^-2    7.5101 x 10^-2    9.066 x 10^-3     1.811 x 10^-1
Kriging     5 x 10^-2    8.4340 x 10^-2    1.0181 x 10^-2    6.736 x 10^-2
Addition    5 x 10^-2    8.3597 x 10^-2    1.0091 x 10^-2    1.281 x 10^-2
Ratio       5 x 10^-2    8.3376 x 10^-2    1.0064 x 10^-2    6.663 x 10^-5
KBNN        5 x 10^-2    8.3380 x 10^-2    1.0065 x 10^-2    8.7170 x 10^-6
KBK         5 x 10^-2    8.3379 x 10^-2    1.0065 x 10^-2    1.5740 x 10^-5
Expensive   5 x 10^-2    8.3379 x 10^-2    1.0065 x 10^-2    N/A



It is clear from these tables that modelling using either
the low fidelity model alone (Cheap) or the nine high fidelity
model response points alone (Kriging) leads to relatively large
errors.

Introducing knowledge in the form of a cheap approximation
(Addition, Ratio) is beneficial as there is some degree of
correlation between the models. We should expect this since
the two models represent the same physical system. In the
example, the Addition model performs worse than the Ratio
model, but how these models perform relative to each other is
highly problem dependent.

The knowledge-based approaches using a weighted low
fidelity model (KBNN, KBK) perform better. As the methods
provide more flexibility than modelling the difference and
ratio alone, they are expected to outperform the Addition and
Ratio models. For a given system it is not clear which of the
KBNN and KBK approaches is likely to perform best, although the
kriging-based approach is generally quicker to set up.

Figs. 7a-c show respectively the objectives and constraint
boundaries for the Cheap, Expensive and KBNN models for Problem
A. Similarly, Figs. 8a-c show respectively plots of the
objectives and constraint boundaries for the Cheap, Expensive
and KBK models for Problem B.

Each of Figs. 7a-c and Figs. 8a-c has t_1 and t_2
respectively plotted along the horizontal and vertical axes and
shows contours of equal structural weight. Clearly the
structural weight decreases as t_1 and t_2 are reduced. The
areas shaded in black correspond to infeasible designs (i.e.
values of t_1 and t_2 for which the resulting stress is greater
than the maximum allowable stress).

The Expensive high fidelity models (Figs. 7b and 8b)
produce the most accurate results, and the better the lower
cost model, the more closely it should replicate the shaded
areas of Figs. 7b and 8b. Comparing Fig. 7a with Fig. 7b and
Fig. 8a with Fig. 8b, the Cheap low fidelity models do not
reproduce the shaded areas well. In contrast, comparing Fig.
7c with Fig. 7b and Fig. 8c with Fig. 8b, the shaded areas
produced by the KBNN multifidelity model for Problem A and the
KBK multifidelity model for Problem B are almost
indistinguishable from those produced by the corresponding
Expensive models. Thus the multifidelity models produce an
accurate representation of the high fidelity models, but at
considerably reduced computational cost.

4D Beam Problem

Turning to the 4D beam problem, the four parameters of the
design space are the cross sectional properties of each beam.
As training data we used 21 points obtained from the high
fidelity model. These points representatively sampled design
space. Solutions to the problem were sought using the Cheap,
KBNN, KBK and Expensive models only. The results are shown in
Table III. It should be noted that direct optimisation of the
high fidelity model using the L-BFGS-B optimizer of Zhu et al.
(1994) required 185 expensive function evaluations, compared to
just 21 evaluations using the KBNN and KBK approaches of the
present invention.
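
To illustrate how function evaluations accumulate during a direct optimisation, the sketch below runs SciPy's L-BFGS-B implementation on a toy problem and counts the calls. Everything about the problem is an assumption made for illustration: the objective and stress ratio are hypothetical closed-form functions rather than FE models, and the stress constraint is handled with a quadratic penalty because L-BFGS-B itself only supports bound constraints. The source does not state how the constraint was imposed.

```python
import numpy as np
from scipy.optimize import minimize

calls = {"n": 0}  # counts every evaluation of the (notionally expensive) model


def toy_volume(t):
    """Hypothetical objective standing in for the structural volume."""
    return float(t[0] + t[1])


def toy_stress_ratio(t):
    """Hypothetical stress divided by the allowable stress."""
    return 0.004 / (t[0] * t[1])


def penalised(t, mu=1.0e3):
    """Objective plus a quadratic penalty on stress-constraint violation."""
    calls["n"] += 1
    violation = max(0.0, toy_stress_ratio(t) - 1.0)
    return toy_volume(t) + mu * violation ** 2


bounds = [(0.05, 0.1), (0.05, 0.1)]  # thickness bounds in metres
result = minimize(penalised, x0=np.array([0.1, 0.1]),
                  method="L-BFGS-B", bounds=bounds)

print("optimum thicknesses:", result.x)
print("function evaluations:", calls["n"])
```

Because no analytic gradient is supplied, the optimiser estimates it by finite differences, and each of those extra evaluations would be a full high fidelity solve in a real setting. This is the cost a surrogate trained on a fixed set of points avoids.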



Table III

Model       t_1          t_2          t_3          t_4               V
Cheap       5 x 10^-2    5 x 10^-2    5 x 10^-2    1.9943 x 10^-2    1.5327 x 10^-3
KBNN        5 x 10^-2    5 x 10^-2    5 x 10^-2    8.8470 x 10^-2    7.9590 x 10^-3
KBK         5 x 10^-2    5 x 10^-2    5 x 10^-2    8.8576 x 10^-2    7.9643 x 10^-3
Expensive   5 x 10^-2    5 x 10^-2    5 x 10^-2    8.8427 x 10^-2    7.9569 x 10^-3

The KBNN and KBK approaches again performed well: including
information from the low fidelity model led to predictions of
the optimum which were very close to the true optimum in both
cases.

Example 2

Next we consider the design of a tail bearing housing for
an aero gas turbine engine. Again the objective is to minimize
the weight of the structure whilst keeping the stress at a key
point below a prescribed value of 2.0 N/mm^2.

The low fidelity model is shown in Fig. 9a. This model
consists of 246 finite elements and requires solution of a
system of 1470 equations. A much more sophisticated high
fidelity model is shown in Fig. 9b. This model consists of
11640 elements and requires solution of a system of 71064
equations. The low fidelity model, which should be much
quicker to solve than the high fidelity model, can be used as a
guide to the behaviour of the high fidelity model. One might
expect a well tuned solver dealing with banded matrices to
scale with perhaps O(N^2), which would give a ratio of run
times of over 2000. However, because the models are, in
absolute terms, both quite small, the savings are less because
of the overheads associated with commercial finite element
codes. We saw a ratio closer to 20.
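
The quoted run-time ratio follows directly from the two equation counts if quadratic scaling is assumed. A short check of the arithmetic, using only figures stated above (the variable names are illustrative):

```python
# Equation counts quoted in the text for the two finite element models.
n_low = 1470
n_high = 71064

size_ratio = n_high / n_low          # roughly 48 times more equations
quadratic_ratio = size_ratio ** 2    # run-time ratio under assumed O(N^2) scaling
observed_ratio = 20                  # approximate ratio reported in the text

print(f"size ratio      : {size_ratio:.1f}")
print(f"O(N^2) estimate : {quadratic_ratio:.0f}")
print(f"estimate / seen : {quadratic_ratio / observed_ratio:.0f}")
```

The estimate exceeds 2000, as stated, while the observed ratio is about two orders of magnitude smaller, which is the gap the text attributes to fixed overheads in the commercial code.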

In the following work an extremely accurate surrogate of
the low fidelity model was used in the knowledge layer. The
surrogate was built using a standard kriging model but with a
relatively large set of training data (500 evaluations for 4
variables). The reason we used an accurate surrogate is
twofold. Firstly, we could then avoid software integration
problems associated with linking our Fortran code to the finite
element solver. Secondly, it led to faster training times,
because although the low fidelity model should be
computationally much cheaper than the high fidelity model, due
to the overheads involved with a commercial finite element code
the difference proved not so great in practice. By utilising
an accurate surrogate in place of the low fidelity model we
avoided this overhead during training of the knowledge based
models (where the weighted low fidelity model implicitly
influences the training procedure).

Four design variables define the structural geometry, and
these were constrained within the following realistic bounds:

$1\ \mathrm{mm} \le x_1 \le 4\ \mathrm{mm}$ (24)
$2\ \mathrm{mm} \le x_2 \le 5\ \mathrm{mm}$
$2\ \mathrm{mm} \le x_3 \le 5\ \mathrm{mm}$
$2\ \mathrm{mm} \le x_4 \le 5\ \mathrm{mm}$

The variables x_1 to x_4 respectively relate to the thickness of
the inner ring faces, inner ring thickness, outer ring
thickness and spoke thickness.

Initially 16 runs of the high fidelity model were made. 
Each run produced a point in design space comprising a set of 
the input parameters and the minimum weight and stress 
associated with these parameters. The points were chosen in 
order representatively to sample design space and were used for 
model training. 
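
The text does not say how the 16 points were chosen beyond requiring that they sample design space representatively. One common way to do this, shown here purely as an assumed example, is Latin hypercube sampling, in which each variable's range is divided into as many equal strata as there are points and every stratum is used exactly once. The bounds are those of (24); the function name and seed are hypothetical.

```python
import random

# Bounds on x_1 to x_4 in millimetres, taken from (24).
BOUNDS_MM = [(1.0, 4.0), (2.0, 5.0), (2.0, 5.0), (2.0, 5.0)]


def latin_hypercube(n_points, bounds, seed=0):
    """Return n_points samples with one sample per stratum in every variable."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_points))
        rng.shuffle(strata)                      # random stratum order per variable
        width = (hi - lo) / n_points
        columns.append([lo + (s + rng.random()) * width for s in strata])
    # Transpose the per-variable columns into design points.
    return [tuple(col[i] for col in columns) for i in range(n_points)]


design = latin_hypercube(16, BOUNDS_MM)
for point in design:
    print(tuple(round(v, 3) for v in point))
```

Each resulting point would then be passed to the high fidelity model to obtain the weight and stress used for training.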

Cheap, Kriging, Ratio, KBNN and KBK models were then used to
model the component. For the purposes of assessing the models'
accuracies, 484 further high fidelity (Expensive) model
evaluations at alternative combinations of the input parameters
were made (but not used in model training). The results of the
models were then compared with these evaluations. Table IV
compares the results of the modelling.
Table IV

Model       Average % error
Cheap       46.4919
Kriging     2.4074
Ratio       1.7040
KBNN (3)    3.2215
KBNN (5)    3.8932
KBK         1.4251



There was reasonably good correlation between the
resulting stresses in the Expensive and Cheap models. However,
the average error in minimum weight calculated by the Cheap
model at the 484 additional points was large. The Kriging
model led to a much reduced average error, and the Ratio model
reduced the average error still further. However, the KBK model
led to the lowest error of all.

Training the KBNN proved to be difficult in this example:
we tried training a KBNN with 3 neurons per layer (KBNN (3) - 43
optimisation variables) as well as a KBNN with 5 neurons per
layer (KBNN (5) - 85 optimisation variables). In both cases we
were unable fully to train the model, leading to generally poor
results. This highlights the potential difficulties with the
KBNN approach. In general relatively large amounts of
(expensive) training data are required. It might also be that
more neurons are required before an acceptable approximation
can be obtained, but this would involve solving an even larger
optimisation problem during training.

For both the KBK and KBNN models the elements of W_1 were
chosen in the range [0.75, 1.10] and those of w_2 in the range
[-0.25, 0.25].

The optimum design produced by the most accurate model
(KBK) weighed 73.31 kg. The optimum design variables x_1 to x_4
were (2.0233, 2.0, 2.0, 2.0) mm and the stress took the value
2.013 N/mm^2, which is very close to our predefined maximum
value of 2.0 N/mm^2.
A direct optimization was then performed using the
Expensive model. The resulting optimum design had a weight of
73.58 kg. The optimum design variables were
(2.0547, 2.0, 2.0, 2.0) mm and the stress value was 1.972
N/mm^2. This required a total of 158 calls to the high fidelity
model. Thus the KBK approach led to a good approximation of
the optimum design but with a significant reduction in
computational cost.
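
The size of the trade-off can be quantified from the figures quoted for the two optimisations. The sketch below uses only those reported numbers; it assumes that the 16 training runs are the only high fidelity evaluations consumed by the KBK approach, which the text implies but does not state outright.

```python
# Figures reported in the text for the tail bearing housing example.
kbk = {"weight_kg": 73.31, "stress": 2.013, "hf_runs": 16}
expensive = {"weight_kg": 73.58, "stress": 1.972, "hf_runs": 158}
STRESS_LIMIT = 2.0  # N/mm^2

# Relative difference between the two optimum weights.
weight_gap_pct = (100.0 * abs(expensive["weight_kg"] - kbk["weight_kg"])
                  / expensive["weight_kg"])

# Amount by which the KBK design's reported stress exceeds the limit.
stress_excess_pct = 100.0 * (kbk["stress"] - STRESS_LIMIT) / STRESS_LIMIT

# Reduction factor in high fidelity model runs.
run_ratio = expensive["hf_runs"] / kbk["hf_runs"]

print(f"weight difference : {weight_gap_pct:.2f} %")
print(f"stress overshoot  : {stress_excess_pct:.2f} %")
print(f"run reduction     : {run_ratio:.1f}x")
```

On these figures the two optimum weights agree to within about 0.4 %, the KBK design overshoots the stress limit by well under 1 %, and roughly a tenth of the high fidelity runs are needed.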

Thus the examples show that multifidelity knowledge-based
modelling approaches according to the present invention are
more effective than standard response surface approaches built
on expensive models alone. This is because the multifidelity
models can provide good approximations with relatively little
training data and can provide relatively accurate
extrapolations.

Furthermore, the multifidelity modelling provided improved
accuracy on a global scale compared to the other methods
described. Clearly Example 1 is somewhat simple, but does
provide a benchmark result for comparing the various
approaches. Example 2 demonstrates the approach on a more
realistic problem.

While the invention has been described in conjunction with 
the exemplary embodiments described above, many equivalent 
modifications and variations will be apparent to those skilled 
in the art when given this disclosure. Accordingly, the 
exemplary embodiments of the invention set forth above are 
considered to be illustrative and not limiting. Various 
changes to the described embodiments may be made without 
departing from the spirit and scope of the invention. 